Multimodal Person Recognition using Unconstrained Audio and Video

Authors

  • Tanzeem Choudhury
  • Brian Clarkson
  • Tony Jebara
  • Alex Pentland
Abstract

We propose a person identification technique that can recognize and verify people from unconstrained video and audio. We do not expect fully frontal face images or clean speech as input. Our recognition algorithm can detect and compensate for pose variation and changes in the auditory background, and can also select the most reliable video frame and audio clip to use for recognition. We also use 3D depth information of a human head to detect the presence of an actual person as opposed to an image of that person. Our system achieves 100% recognition and verification rates on natural real-time input with 26 registered clients.


Similar resources

Percol0 - un système multimodal de détection de personnes dans des documents vidéo (Percol0 - A multimodal person detection system in video documents) [in French]

The goal of the PERCOL project is to participate in the REPERE multimodal evaluation program by building a consortium combining different scientific fields (audio, text and video) in order to perform person recognition in video documents. The two main scientific challenges we are addressing are firstly multimodal fusion algorithms ...


Evidence Theory-Based Multimodal Emotion Recognition

Automatic recognition of human affective states is still a largely unexplored and challenging topic. Even more issues arise when dealing with variable quality of the inputs or aiming for real-time, unconstrained, and person independent scenarios. In this paper, we explore audio-visual multimodal emotion recognition. We present SAMMI, a framework designed to extract real-time emotion appraisals ...


PERCOLI: A Person Identification System for the 2013 REPERE Challenge

The goal of the PERCOL project is to participate in the REPERE multimodal challenge by building a consortium combining different scientific fields (audio, text and video) in order to perform person recognition in video documents. The two main scientific issues addressed by the challenge are firstly multimodal fusion algorithms for automatic person recognition in video broadcast; and secondly t...


The eNTERFACE'05 Audio-Visual Emotion Database

This paper presents an audio-visual emotion database that can be used as a reference database for testing and evaluating video, audio or joint audio-visual emotion recognition algorithms. Additional uses may include the evaluation of algorithms performing other multimodal signal processing tasks, such as multimodal person identification or audio-visual speech recognition. This paper presents th...


Robust indoor speaker recognition in a network of audio and video sensors

Situational awareness is achieved naturally by the human senses of sight and hearing in combination. Automatic scene understanding aims at replicating this human ability using microphones and cameras in cooperation. In this paper, audio and video signals are fused and integrated at different levels of semantic abstractions. We detect and track a speaker who is relatively unconstrained, i.e., fr...




Publication date: 1998